2D Audiovisual Text-to-Speech Synthesis for Human-Machine Interaction in Dutch
نویسندگان
چکیده
Speech has always been the most important means of communication between humans. Therefore, using speech in machine-human communication can help in increasing the naturalness of the communication between a computer system and a user. Systems that can make a machine pronounce any given input text are referred to as text-to-speech systems. To further enhance the communication, a talking head can be added to the text-to-speech synthesis, since the addition of this synthetic visual speech mode will improve the intelligibility of the artificial speech (Pandzic et al., 1999). Furthermore, users will perceive this multimodal speech communication as more natural and they will feel more positive and confident if they can see the (artificial) person that is talking to them. In this paper we propose an audiovisual text-to-speech synthesis system for Dutch that is able to create both the target auditory and the target visual speech by using a same audiovisual database, which makes it possible to maximize the intermodal coherence in the audiovisual output signal.
منابع مشابه
Auditory and photo-realistic audiovisual speech synthesis for Dutch
Both auditory and audiovisual speech synthesis have been the subject of many research projects throughout the years. Unfortunately, in recent years only very few research focuses on synthesis for the Dutch language. Especially for audiovisual synthesis, hardly any available system or resource can be found. In this paper we describe the creation of a new extensive Dutch speech database, containi...
متن کاملRule-based visual speech synthesis
A system for rule based audiovisual text-to-speech synthesis has been created. The system is based on the KTH text-to-speech system which has been complemented with a three-dimensional parameterized model of a human face. The face can be animated in real time, synchronized with the auditory speech. The facial model is controlled by the same synthesis software as the auditory speech synthesizer....
متن کاملStudy on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One of the interesting topics on multimedia domain is concerned with empowering computer in order to speech production. Speech synthesis is granting human abilities to the computer for speech production. Data-based approach and process-based approach are the two main approaches on speech synthesis. Each approach has its varied challenges. Unit-selection speech synthesis and statistical parametr...
متن کاملJASMIN-CGN: Extension of the Spoken Dutch Corpus with Speech of Elderly People, Children and Non-natives in the Human-Machine Interaction Modality
Large speech corpora (LSC) constitute an indispensable resource for conducting research in speech processing and for developing real-life speech applications. In 2004 the Spoken Dutch Corpus (CGN) became available, a corpus of standard Dutch as spoken by adult natives in the Netherlands and Flanders. Owing to budget constraints, CGN does not include speech of children, non-natives, elderly peop...
متن کاملMultimodal coherency issues in designing and optimizing audiovisual speech synthesis techniques
This paper proposes a 2D audiovisual text-to-speech synthesis system that constructs the output signal by selecting and concatenating multimodal segments containing natural combinations of audio and video. We describe the experiments that were conducted in order to assess the impact of this joint audio/video synthesis technique on the perceived quality of the synthetic speech. The experiments i...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008